
    An improved extrinsic monolingual plagiarism detection approach of the Bengali text

    Plagiarism is an act of literary fraud: presenting others’ work or ideas without crediting the original source. This definition covers all written documents, published or unpublished. Plagiarism has increased significantly over the last few years and is a concern for students, academicians, and professionals. As a result, several plagiarism detection tools are available for different languages. However, very little work has been done on Bengali, and no plagiarism detection software is available for it, even though Bengali is one of the most spoken languages in the world. In this paper, we propose a plagiarism detection tool for the Bengali language that focuses mainly on the educational and newspaper domains. We collected 82 textbooks from the National Curriculum and Textbook Board (NCTB), Bangladesh, scraped all articles from 12 reputed newspapers, and compiled a corpus of more than 10 million sentences. On this Bengali text corpus, the proposed method shows an accuracy rate of 97.31